Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype–phenotype interactions

Motivation: A central goal of current biology is to establish a complete functional link between genotype and phenotype, the so-called genotype–phenotype map. With the continuous development of high-throughput technology and the decline in sequencing costs, multi-omics analysis has beco...


Bibliographic Details
Main Authors:	Xinpeng Guo, Jinyu Han, Yafei Song, Zhilei Yin, Shuaichen Liu, Xuequn Shang
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2022-08-01
Series:	Frontiers in Genetics
Subjects:	eQTL; expression quantitative trait loci; graph-embedded deep neural network; genotype-phenotype; SNP; gene
Online Access:	https://www.frontiersin.org/articles/10.3389/fgene.2022.921775/full

collection	DOAJ
description	Motivation: A central goal of current biology is to establish a complete functional link between genotype and phenotype, the so-called genotype–phenotype map. With the continuous development of high-throughput technology and the decline in sequencing costs, multi-omics analysis has become more widely employed. While this gives us new opportunities to uncover the correlation mechanisms between single-nucleotide polymorphisms (SNPs), genes, and phenotypes, multi-omics analysis still faces certain challenges, specifically: 1) when the sample size is large enough, the number of omics types is often not large enough to meet the requirements of multi-omics analysis; 2) each omics’ internal correlations are often unclear, such as the correlation between genes in genomics; 3) when analyzing a large number of traits (p), the sample size (n) is often much smaller than p (n << p), hindering the application of machine learning methods in the classification of disease outcomes. Results: To address these issues and build a robust classification model, we propose a graph-embedded deep neural network (G-EDNN) based on expression quantitative trait loci (eQTL) data, which achieves sparse connectivity between network layers to prevent overfitting. The correlation within each omics is also considered, so that the model more closely resembles biological reality. To verify the capabilities of this method, we conducted experimental analysis using the GSE28127 and GSE95496 data sets from the Gene Expression Omnibus (GEO) database, tested various neural network architectures, and used prior data for feature selection and graph embedding. Results show that the proposed method achieves high classification accuracy and easy-to-interpret feature selection. This method represents an extended application of genotype–phenotype association analysis in deep learning networks.
first_indexed	2024-04-13T13:15:43Z
id	doaj.art-e4766974a7784ed2baab5e22aac3cbfe
institution	Directory Open Access Journal
issn	1664-8021
last_indexed	2024-04-13T13:15:43Z
spelling	doaj.art-e4766974a7784ed2baab5e22aac3cbfe; 2022-12-22T02:45:29Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2022-08-01; 13; 10.3389/fgene.2022.921775; 921775
author_affiliations	Xinpeng Guo (0, 1); Jinyu Han (2); Yafei Song (3); Zhilei Yin (4); Shuaichen Liu (5); Xuequn Shang (6)
	0: School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
	1: School of Air and Missile Defense, Air Force Engineering University, Xi’an, China
	2: School of Economics and Management, Chang’an University, Xi’an, China
	3: School of Air and Missile Defense, Air Force Engineering University, Xi’an, China
	4: School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
	5: School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an, China
	6: School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China


Similar Items