Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype–phenotype interactions
Motivation: A central goal of current biology is to establish a complete functional link between the genotype and phenotype, known as the so-called genotype–phenotype map. With the continuous development of high-throughput technology and the decline in sequencing costs, multi-omics analysis has beco...
Main Authors: | Xinpeng Guo, Jinyu Han, Yafei Song, Zhilei Yin, Shuaichen Liu, Xuequn Shang |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-08-01 |
Series: | Frontiers in Genetics |
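Subjects: | eQTL; expression quantitative trait loci; graph-embedded deep neural network; genotype-phenotype; SNP; gene |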
Online Access: | https://www.frontiersin.org/articles/10.3389/fgene.2022.921775/full |
_version_ | 1811321306796261376 |
---|---|
author | Xinpeng Guo; Jinyu Han; Yafei Song; Zhilei Yin; Shuaichen Liu; Xuequn Shang |
author_sort | Xinpeng Guo |
collection | DOAJ |
description | Motivation: A central goal of current biology is to establish a complete functional link between genotype and phenotype, known as the genotype–phenotype map. With the continuous development of high-throughput technology and the decline in sequencing costs, multi-omics analysis has become more widely employed. While this offers new opportunities to uncover the correlation mechanisms among single-nucleotide polymorphisms (SNPs), genes, and phenotypes, multi-omics analysis still faces certain challenges: 1) when the sample size is large enough, the number of omics types is often too small to meet the requirements of multi-omics analysis; 2) the internal correlations within each omics layer are often unclear, such as the correlations between genes in genomics; 3) when a large number of traits (p) is analyzed, the sample size (n) is often much smaller than p (n << p), which hinders the application of machine learning methods to the classification of disease outcomes. Results: To address these issues and build a robust classification model, we propose a graph-embedded deep neural network (G-EDNN) based on expression quantitative trait loci (eQTL) data, which achieves sparse connectivity between network layers to prevent overfitting. The correlation within each omics layer is also considered so that the model more closely resembles biological reality. To verify the capabilities of this method, we conducted experimental analysis using the GSE28127 and GSE95496 data sets from the Gene Expression Omnibus (GEO) database, tested various neural network architectures, and used prior data for feature selection and graph embedding. The results show that the proposed method can achieve high classification accuracy and easy-to-interpret feature selection. This method extends genotype–phenotype association analysis to deep learning networks. |
first_indexed | 2024-04-13T13:15:43Z |
format | Article |
id | doaj.art-e4766974a7784ed2baab5e22aac3cbfe |
institution | Directory of Open Access Journals |
issn | 1664-8021 |
language | English |
last_indexed | 2024-04-13T13:15:43Z |
publishDate | 2022-08-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Genetics |
spelling | doaj.art-e4766974a7784ed2baab5e22aac3cbfe; 2022-12-22T02:45:29Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2022-08-01; 13; 10.3389/fgene.2022.921775; 921775; Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype–phenotype interactions; Xinpeng Guo (0, 1), Jinyu Han (2), Yafei Song (3), Zhilei Yin (4), Shuaichen Liu (5), Xuequn Shang (6); 0: School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China; 1: School of Air and Missile Defense, Air Force Engineering University, Xi’an, China; 2: School of Economics and Management, Chang’an University, Xi’an, China; 3: School of Air and Missile Defense, Air Force Engineering University, Xi’an, China; 4: School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China; 5: School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an, China; 6: School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China; https://www.frontiersin.org/articles/10.3389/fgene.2022.921775/full; eQTL; expression quantitative trait loci; graph-embedded deep neural network; genotype-phenotype; SNP; gene |
title | Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype–phenotype interactions |
topic | eQTL; expression quantitative trait loci; graph-embedded deep neural network; genotype-phenotype; SNP; gene |
url | https://www.frontiersin.org/articles/10.3389/fgene.2022.921775/full |
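
The abstract in this record describes sparse connectivity between network layers derived from eQTL data. As a rough illustration of that general idea only, the sketch below masks a dense layer so that a SNP input feeds a gene unit only where an eQTL pair links them. It is not the authors' G-EDNN implementation: the function names (`build_mask`, `masked_forward`), the toy SNP/gene indices, the ReLU activation, and the random weights are all assumptions made for this example.

```python
import random


def build_mask(n_inputs, n_outputs, edges):
    """Return an n_inputs x n_outputs 0/1 mask with 1.0 where an edge exists."""
    mask = [[0.0] * n_outputs for _ in range(n_inputs)]
    for i, j in edges:
        mask[i][j] = 1.0
    return mask


def masked_forward(x, weights, mask, bias):
    """One sparse layer: out_j = relu(b_j + sum_i x_i * w_ij * m_ij)."""
    out = []
    for j in range(len(bias)):
        total = bias[j]
        for i, x_i in enumerate(x):
            # A zero mask entry removes the connection between input i and unit j.
            total += x_i * weights[i][j] * mask[i][j]
        out.append(max(0.0, total))
    return out


# Hypothetical toy data: 4 SNPs, 2 genes, 3 eQTL pairs (SNP index, gene index).
eqtl_edges = [(0, 0), (1, 0), (3, 1)]
mask = build_mask(4, 2, eqtl_edges)

rng = random.Random(0)
weights = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(4)]
bias = [0.0, 0.0]

genotype = [0.0, 1.0, 2.0, 1.0]  # assumed minor-allele counts per SNP
hidden = masked_forward(genotype, weights, mask, bias)

# Number of connections kept by the mask (out of 4 * 2 = 8 dense weights).
n_active = sum(int(m) for row in mask for m in row)
```

In this toy setup SNP 2 has no eQTL edge, so changing its genotype leaves `hidden` unchanged. Only 3 of the 8 possible weights contribute, which shows how a prior graph can reduce the number of effective parameters when n << p.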