High, clustered, nucleotide diversity in the genome of Anopheles gambiae revealed through pooled-template sequencing: implications for high-throughput genotyping protocols

Bibliographic Details
Main Authors: Steen Keith, Weetman David, Wilding Craig S, Donnelly Martin J
Format: Article
Language: English
Published: BMC 2009-07-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/10/320

Abstract

Background
Association mapping approaches are dependent upon discovery and validation of single nucleotide polymorphisms (SNPs). To further association studies in Anopheles gambiae we conducted a major resequencing programme, primarily targeting regions within or close to candidate genes for insecticide resistance.

Results
Using two pools of mosquito template DNA we sequenced over 300 kbp across 660 distinct amplicons of the An. gambiae genome. Comparison of SNPs identified from pooled templates with those from individual sequences revealed a very low false positive rate. False negative rates were much higher and mostly resulted from SNPs with a low minor allele frequency. Pooled-template sequencing also provided good estimates of SNP allele frequencies. Allele frequency estimation success, along with false positive and negative call rates, improved significantly when using a qualitative measure of SNP call quality. We identified a total of 7062 polymorphic features comprising 6995 SNPs and 67 indels, with, on average, a SNP every 34 bp; a high rate of polymorphism that is comparable to other studies of mosquitoes. SNPs were significantly more frequent in members of the cytochrome p450 mono-oxygenases and carboxy/cholinesterase gene-families than in glutathione-S-transferases, other detoxification genes, and control genomic regions. Polymorphic sites showed a significantly clustered distribution, but the degree of SNP clustering (independent of SNP frequency) did not vary among gene families, suggesting that clustering of polymorphisms is a general property of the An. gambiae genome.

Conclusion
The high frequency and clustering of SNPs has important ramifications for the design of high-throughput genotyping assays based on allele specific primer extension or probe hybridisation. We illustrate these issues in the context of the design of Illumina GoldenGate assays.

ISSN: 1471-2164
Institution: Directory of Open Access Journals (DOAJ)
Record ID: doaj.art-a83480e215564e0f8c113ace304b7f11
DOI: 10.1186/1471-2164-10-320