Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype–phenotype interactions

Bibliographic Details
Main Authors: Xinpeng Guo, Jinyu Han, Yafei Song, Zhilei Yin, Shuaichen Liu, Xuequn Shang
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-08-01
Series: Frontiers in Genetics
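Subjects: eQTL; expression quantitative trait loci; graph-embedded deep neural network; genotype-phenotype; SNP; gene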
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2022.921775/full
collection DOAJ
description Motivation: A central goal of current biology is to establish a complete functional link between genotype and phenotype, the so-called genotype–phenotype map. With the continued development of high-throughput technology and the decline in sequencing costs, multi-omics analysis has become more widely employed. While this offers new opportunities to uncover the mechanisms linking single-nucleotide polymorphisms (SNPs), genes, and phenotypes, multi-omics analysis still faces certain challenges: 1) when the sample size is large enough, the number of omics types is often too small to meet the requirements of multi-omics analysis; 2) the correlations within each omics layer are often unclear, such as the correlations between genes in genomics; 3) when analyzing a large number of traits (p), the sample size (n) is often much smaller than p (n << p), which hinders the application of machine learning methods to the classification of disease outcomes.
Results: To address these issues and build a robust classification model, we propose a graph-embedded deep neural network (G-EDNN) based on expression quantitative trait loci (eQTL) data, which achieves sparse connectivity between network layers to prevent overfitting. The correlation within each omics layer is also considered, so that the model more closely resembles biological reality. To verify the capabilities of this method, we conducted experimental analysis using the GSE28127 and GSE95496 data sets from the Gene Expression Omnibus (GEO) database, tested various neural network architectures, and used prior data for feature selection and graph embedding. The results show that the proposed method can achieve high classification accuracy and easy-to-interpret feature selection. This method represents an extended application of genotype–phenotype association analysis in deep learning networks.
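Illustrative note: the abstract states that eQTL data are used to obtain sparse connectivity between network layers. The sketch below shows one common way such sparsity can be imposed, a dense layer whose weights are multiplied by a binary SNP-to-gene mask. It is not the authors' G-EDNN implementation; the SNP and gene names, the eQTL pairs, the tanh activation, and the helper names `build_mask` and `masked_forward` are all hypothetical and chosen only for illustration.

```python
import math
import random


def build_mask(snps, genes, eqtl_pairs):
    """Binary mask: mask[i][j] is 1.0 if SNP i is an eQTL for gene j, else 0.0."""
    pairs = set(eqtl_pairs)
    return [[1.0 if (s, g) in pairs else 0.0 for g in genes] for s in snps]


def masked_forward(x, weights, mask, bias):
    """One sparse layer: out_j = tanh(b_j + sum_i x_i * w_ij * mask_ij)."""
    out = []
    for j in range(len(bias)):
        z = bias[j] + sum(
            x[i] * weights[i][j] * mask[i][j] for i in range(len(x))
        )
        out.append(math.tanh(z))
    return out


# Hypothetical toy inputs (not taken from GSE28127 or GSE95496).
snps = ["rs1", "rs2", "rs3"]
genes = ["geneA", "geneB"]
eqtl_pairs = [("rs1", "geneA"), ("rs2", "geneA"), ("rs3", "geneB")]

mask = build_mask(snps, genes, eqtl_pairs)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = [[rng.uniform(-1.0, 1.0) for _ in genes] for _ in snps]
bias = [0.0 for _ in genes]

genotype = [0.0, 1.0, 2.0]  # minor-allele counts for one sample
hidden = masked_forward(genotype, weights, mask, bias)

# Only connections backed by an eQTL pair can carry signal.
n_active = int(sum(sum(row) for row in mask))
print(f"active connections: {n_active} of {len(snps) * len(genes)}")
```

In this toy setup only 3 of the 6 possible SNP-to-gene connections are active, so the number of effective parameters is tied to the prior eQTL graph rather than to the full product of layer sizes, which is the general idea behind using sparsity to limit overfitting when n << p.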
first_indexed 2024-04-13T13:15:43Z
id doaj.art-e4766974a7784ed2baab5e22aac3cbfe
institution Directory Open Access Journal
issn 1664-8021
last_indexed 2024-04-13T13:15:43Z
spelling doaj.art-e4766974a7784ed2baab5e22aac3cbfe; 2022-12-22T02:45:29Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2022-08-01; vol. 13; 10.3389/fgene.2022.921775; 921775
Authors (with affiliation index): Xinpeng Guo (0, 1); Jinyu Han (2); Yafei Song (3); Zhilei Yin (4); Shuaichen Liu (5); Xuequn Shang (6)
Affiliations:
0 School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
1 School of Air and Missile Defense, Air Force Engineering University, Xi’an, China
2 School of Economics and Management, Chang’an University, Xi’an, China
3 School of Air and Missile Defense, Air Force Engineering University, Xi’an, China
4 School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
5 School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an, China
6 School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
title Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype–phenotype interactions
topic eQTL
expression quantitative trait loci
graph-embedded deep neural network
genotype-phenotype
SNP
gene