SNP imputation bias reduces effect size determination

Imputation is a commonly used technique that exploits linkage disequilibrium to infer missing genotypes in genetic datasets, using a well characterized reference population. While there is agreement that the reference population has to match the ethnicity of the query dataset, it is common practice...

Bibliographic Details
Main Authors: Pouya Khankhanian, Lennox Din, Stacy J Caillier, Pierre-Antoine Gourraud, Sergio E Baranzini
Format: Article
Language: English
Published: Frontiers Media S.A. 2015-02-01
Series: Frontiers in Genetics
Subjects: GWAS, SNP, accuracy, Imputation, Bias Correction
Online Access: http://journal.frontiersin.org/Journal/10.3389/fgene.2015.00030/full
collection DOAJ
description Imputation is a commonly used technique that exploits linkage disequilibrium to infer missing genotypes in genetic datasets, using a well-characterized reference population. While there is agreement that the reference population has to match the ethnicity of the query dataset, it is common practice to use the same reference to impute genotypes for a wide variety of phenotypes. We hypothesized that using a reference composed of samples with a different phenotype than the query dataset would introduce imputation bias. To test this hypothesis, we used GWAS datasets from amyotrophic lateral sclerosis, Parkinson disease, and Crohn disease. First, we masked and then imputed 100 disease-associated markers and 100 non-associated markers from each study. Two references for imputation were used in parallel: one consisting of healthy controls and another consisting of patients with the same disease. We assessed the discordance (imprecision) and bias (inaccuracy) of imputation by comparing predicted genotypes to those assayed by SNP-chip. We also assessed the bias on the observed effect size when the predicted genotypes were used in a GWAS. When healthy controls were used as reference for imputation, a significant bias was observed, particularly in the disease-associated markers. Using cases as reference significantly attenuated this bias. For nearly all markers, the direction of the bias favored the non-risk allele. In GWAS of the three diseases (with healthy controls from the 1000 Genomes as reference), the mean OR for disease-associated markers obtained by imputation was lower than that obtained using the original assayed genotypes. We found that the bias is inherent to imputation, as using different methods did not alter the results. In conclusion, imputation is a powerful method to predict genotypes and estimate genetic risk for GWAS. However, a careful choice of reference population is needed to minimize biases inherent to this approach.
id doaj.art-ce550aa3a1fa468b81ae85eb07fb04cf
institution Directory Open Access Journal
issn 1664-8021
affiliation University of California San Francisco (all five authors)
volume 6
doi 10.3389/fgene.2015.00030
topic GWAS
SNP
accuracy
Imputation
Bias Correction