Rapid, Reference-Free human genotype imputation with denoising autoencoders

Genotype imputation is a foundational tool for population genetics. Standard statistical imputation approaches rely on the co-location of large whole-genome sequencing-based reference panels, powerful computing environments, and potentially sensitive genetic study data. This results in computational...

Bibliographic Details
Main Authors: Raquel Dias, Doug Evans, Shang-Fu Chen, Kai-Yu Chen, Salvatore Loguercio, Leslie Chan, Ali Torkamani
Format: Article
Language: English
Published: eLife Sciences Publications Ltd 2022-09-01
Series: eLife
Subjects: imputation; deep learning; artificial intelligence; population genetics; genomics; autoencoder
Online Access: https://elifesciences.org/articles/75600
author Raquel Dias
Doug Evans
Shang-Fu Chen
Kai-Yu Chen
Salvatore Loguercio
Leslie Chan
Ali Torkamani
author_sort Raquel Dias
collection DOAJ
description Genotype imputation is a foundational tool for population genetics. Standard statistical imputation approaches rely on the co-location of large whole-genome sequencing-based reference panels, powerful computing environments, and potentially sensitive genetic study data. This results in computational resource and privacy-risk barriers to access to cutting-edge imputation techniques. Moreover, the accuracy of current statistical approaches is known to degrade in regions of low and complex linkage disequilibrium. Artificial neural network-based imputation approaches may overcome these limitations by encoding complex genotype relationships in easily portable inference models. Here, we demonstrate an autoencoder-based approach for genotype imputation, using a large, commonly used reference panel, and spanning the entirety of human chromosome 22. Our autoencoder-based genotype imputation strategy achieved superior imputation accuracy across the allele-frequency spectrum and across genomes of diverse ancestry, while delivering at least fourfold faster inference run time relative to standard imputation tools.
first_indexed 2024-04-12T16:17:18Z
format Article
id doaj.art-6da6df6501724033943328bdb93d9bb0
institution Directory Open Access Journal
issn 2050-084X
language English
last_indexed 2024-04-12T16:17:18Z
publishDate 2022-09-01
publisher eLife Sciences Publications Ltd
record_format Article
series eLife
spelling doaj.art-6da6df6501724033943328bdb93d9bb0
2022-12-22T03:25:41Z
eng
eLife Sciences Publications Ltd
eLife
2050-084X
2022-09-01
11
10.7554/eLife.75600
Rapid, Reference-Free human genotype imputation with denoising autoencoders
Raquel Dias 0
Doug Evans 1
Shang-Fu Chen 2
Kai-Yu Chen 3
Salvatore Loguercio 4
Leslie Chan 5
Ali Torkamani 6
https://orcid.org/0000-0003-0232-8053
Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States; Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States; Department of Microbiology and Cell Science, University of Florida, Gainesville, United States
Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States; Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States; Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States; Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States; Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States; Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States; Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
https://elifesciences.org/articles/75600
imputation
deep learning
artificial intelligence
population genetics
genomics
autoencoder
title Rapid, Reference-Free human genotype imputation with denoising autoencoders
topic imputation
deep learning
artificial intelligence
population genetics
genomics
autoencoder
url https://elifesciences.org/articles/75600