Empirical distributions of F(ST) from large-scale human polymorphism data.

Bibliographic Details
Main Author: Eran Elhaik
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3504095?pdf=render
Collection: DOAJ (Directory of Open Access Journals)
Description: Studies of the apportionment of human genetic variation have long established that most human variation is within population groups and that the additional variation between population groups is small but greatest when comparing different continental populations. These studies often used Wright's F(ST), which apportions the standardized variance in allele frequencies within and between population groups. Because local adaptations increase population differentiation, high F(ST) may be found at closely linked loci under selection and used to identify genes undergoing directional or heterotic selection. We re-examined these processes using HapMap data. We analyzed 3 million SNPs on 602 samples from eight worldwide populations and a consensus subset of 1 million SNPs found in all populations. We identified four major features of the data. First, a hierarchical F(ST) analysis showed that only a small fraction (12%) of the total genetic variation is distributed between continental populations, and an even smaller fraction (1%) is found between intra-continental populations. Second, the global F(ST) distribution closely follows an exponential distribution. Third, although the overall F(ST) distribution is similarly shaped (inverse J), F(ST) distributions vary markedly by allele frequency when divided into non-overlapping groups by allele frequency range. Because the mean allele frequency is a crude indicator of allele age, these distributions mark the time-dependent change in genetic differentiation. Finally, the change in mean F(ST) of these groups is linear in allele frequency. These results suggest that investigating the extremes of the F(ST) distribution for each allele frequency group is more efficient for detecting selection. Consequently, we demonstrate that such extreme SNPs are more clustered along the chromosomes than expected from linkage disequilibrium for each allele frequency group. These genomic regions are therefore likely candidates for natural selection.
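As an illustration of the "standardized variance in allele frequencies" mentioned in the description, the following is a minimal sketch of Wright's F(ST) for a single biallelic SNP. It assumes equally weighted populations and known allele frequencies, and it ignores sample-size corrections; the function name `wright_fst` and the example frequencies are hypothetical and are not taken from the article, whose own estimator may differ.

```python
from statistics import fmean, pvariance


def wright_fst(freqs):
    """Wright's F_ST for one biallelic SNP.

    freqs: allele frequency of the same allele in each population.
    Assumption: populations are weighted equally and frequencies are
    treated as exact (no sampling correction).
    """
    p_bar = fmean(freqs)  # mean allele frequency across populations
    if p_bar <= 0.0 or p_bar >= 1.0:
        raise ValueError("F_ST is undefined for a monomorphic site")
    # Between-population variance, standardized by its maximum p_bar * (1 - p_bar)
    return pvariance(freqs, mu=p_bar) / (p_bar * (1.0 - p_bar))


if __name__ == "__main__":
    # Made-up frequencies for three populations, for illustration only
    print(round(wright_fst([0.10, 0.15, 0.40]), 4))
```

Identical frequencies in every population give 0, and populations fixed for alternative alleles give 1, which bounds the per-SNP values whose distribution the description refers to.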
ISSN: 1932-6203
Citation: PLoS ONE 7(11): e49837 (2012). DOI: 10.1371/journal.pone.0049837