Joint genotype calling with array and sequence data.

Analysis of rare variants is currently a major focus of genetic studies of human disease. Single-nucleotide polymorphism (SNP) genotypes can be assayed using microarray genotyping or by sequencing, but neither technology produces perfect genotype calls, especially at rare SNPs. Studies that collect...

Bibliographic Details
Main Authors: O'Connell, J, Marchini, J
Format: Journal article
Language: English
Published: 2012
collection OXFORD
description Analysis of rare variants is currently a major focus of genetic studies of human disease. Single-nucleotide polymorphism (SNP) genotypes can be assayed using microarray genotyping or by sequencing, but neither technology produces perfect genotype calls, especially at rare SNPs. Studies that collect both types of data are becoming increasingly common, so it may be possible to combine data types to increase accuracy. We present a method, called Chiamante, which calls genotypes on individuals with either array data, sequence data, or both. The model adapts to data quality and can estimate when either the array or the sequence data should be ignored when calling the genotypes at each SNP. As a special case, our method will call genotypes from only array data and outperforms existing methods in this scenario. We have applied our method to array and sequence data from Phase I of the 1000 Genomes Project and show that it provides improved performance, especially at rare SNPs. This method provides a foundation for future efforts to fuse genetic data from different sources, for example, when combining data from exome sequencing and exome microarrays.
first_indexed 2024-03-07T06:13:15Z
format Journal article
id oxford-uuid:f03d7174-07f8-4bf2-9065-0c7492e35dfa
institution University of Oxford
language English
last_indexed 2024-03-07T06:13:15Z
publishDate 2012
record_format dspace
spelling oxford-uuid:f03d7174-07f8-4bf2-9065-0c7492e35dfa; 2022-03-27T11:46:20Z; Joint genotype calling with array and sequence data.; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:f03d7174-07f8-4bf2-9065-0c7492e35dfa; English; Symplectic Elements at Oxford; 2012; O'Connell, J; Marchini, J
title Joint genotype calling with array and sequence data.