US Population Data for 94 Identity-Informative SNP Loci

The US National Institute of Standards and Technology (NIST) analyzed a set of 1036 samples representing four major US population groups (African American, Asian American, Caucasian, and Hispanic) with 94 single nucleotide polymorphisms (SNPs) used for individual identification (iiSNPs). The compact...

Bibliographic Details
Main Authors: Kevin M. Kiesler, Lisa A. Borsuk, Carolyn R. Steffen, Peter M. Vallone, Katherine B. Gettings
Format: Article
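Language: English
Published: MDPI AG 2023-05-01
Series: Genes
Subjects: single nucleotide polymorphism, human identification, microhaplotype, next generation sequencing
Online Access: https://www.mdpi.com/2073-4425/14/5/1071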
_version_ 1797599936694452224
author Kevin M. Kiesler
Lisa A. Borsuk
Carolyn R. Steffen
Peter M. Vallone
Katherine B. Gettings
author_facet Kevin M. Kiesler
Lisa A. Borsuk
Carolyn R. Steffen
Peter M. Vallone
Katherine B. Gettings
author_sort Kevin M. Kiesler
collection DOAJ
description The US National Institute of Standards and Technology (NIST) analyzed a set of 1036 samples representing four major US population groups (African American, Asian American, Caucasian, and Hispanic) with 94 single nucleotide polymorphisms (SNPs) used for individual identification (iiSNPs). The compact size of iiSNP amplicons compared to short tandem repeat (STR) markers increases the likelihood of successful amplification with degraded DNA samples. Allele frequencies and relevant forensic statistics were calculated for each population group as well as the aggregate population sample. Examination of sequence data in the regions flanking the targeted SNPs identified additional variants, which can be combined with the target SNPs to form microhaplotypes (multiple phased SNPs within a short-read sequence). Comparison of iiSNP performance with and without flanking SNP variation identified four amplicons containing microhaplotypes with observed heterozygosity increases of greater than 15% over the targeted SNP alone. For this set of 1036 samples, comparison of average match probabilities from iiSNPs with the 20 CODIS core STR markers yielded an estimate of 1.7 × 10⁻³⁸ for iiSNPs (assuming independence between all 94 SNPs), which was four orders of magnitude lower (more discriminating) than STRs where internal sequence variation was considered, and 10 orders of magnitude lower than STRs using established capillary electrophoresis length-based genotypes.
first_indexed 2024-03-11T03:42:28Z
format Article
id doaj.art-3dd8568a83404864842f71b212def0c4
institution Directory Open Access Journal
issn 2073-4425
language English
last_indexed 2024-03-11T03:42:28Z
publishDate 2023-05-01
publisher MDPI AG
record_format Article
series Genes
spelling doaj.art-3dd8568a83404864842f71b212def0c4
2023-11-18T01:30:04Z
eng
MDPI AG
Genes
2073-4425
2023-05-01
Volume 14, Issue 5, Article 1071
10.3390/genes14051071
US Population Data for 94 Identity-Informative SNP Loci
Kevin M. Kiesler
Lisa A. Borsuk
Carolyn R. Steffen
Peter M. Vallone
Katherine B. Gettings
National Institute of Standards and Technology, 100 Bureau Drive, Mailstop 8314, Gaithersburg, MD 20899, USA (affiliation listed for all five authors)
The US National Institute of Standards and Technology (NIST) analyzed a set of 1036 samples representing four major US population groups (African American, Asian American, Caucasian, and Hispanic) with 94 single nucleotide polymorphisms (SNPs) used for individual identification (iiSNPs). The compact size of iiSNP amplicons compared to short tandem repeat (STR) markers increases the likelihood of successful amplification with degraded DNA samples. Allele frequencies and relevant forensic statistics were calculated for each population group as well as the aggregate population sample. Examination of sequence data in the regions flanking the targeted SNPs identified additional variants, which can be combined with the target SNPs to form microhaplotypes (multiple phased SNPs within a short-read sequence). Comparison of iiSNP performance with and without flanking SNP variation identified four amplicons containing microhaplotypes with observed heterozygosity increases of greater than 15% over the targeted SNP alone. For this set of 1036 samples, comparison of average match probabilities from iiSNPs with the 20 CODIS core STR markers yielded an estimate of 1.7 × 10⁻³⁸ for iiSNPs (assuming independence between all 94 SNPs), which was four orders of magnitude lower (more discriminating) than STRs where internal sequence variation was considered, and 10 orders of magnitude lower than STRs using established capillary electrophoresis length-based genotypes.
https://www.mdpi.com/2073-4425/14/5/1071
single nucleotide polymorphism
human identification
microhaplotype
next generation sequencing
spellingShingle Kevin M. Kiesler
Lisa A. Borsuk
Carolyn R. Steffen
Peter M. Vallone
Katherine B. Gettings
US Population Data for 94 Identity-Informative SNP Loci
Genes
single nucleotide polymorphism
human identification
microhaplotype
next generation sequencing
title US Population Data for 94 Identity-Informative SNP Loci
title_full US Population Data for 94 Identity-Informative SNP Loci
title_fullStr US Population Data for 94 Identity-Informative SNP Loci
title_full_unstemmed US Population Data for 94 Identity-Informative SNP Loci
title_short US Population Data for 94 Identity-Informative SNP Loci
title_sort us population data for 94 identity informative snp loci
topic single nucleotide polymorphism
human identification
microhaplotype
next generation sequencing
url https://www.mdpi.com/2073-4425/14/5/1071
work_keys_str_mv AT kevinmkiesler uspopulationdatafor94identityinformativesnploci
AT lisaaborsuk uspopulationdatafor94identityinformativesnploci
AT carolynrsteffen uspopulationdatafor94identityinformativesnploci
AT petermvallone uspopulationdatafor94identityinformativesnploci
AT katherinebgettings uspopulationdatafor94identityinformativesnploci
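
The description field reports a combined match probability that assumes independence between all 94 SNPs. The following is a minimal sketch of how such a figure can be built up from allele frequencies. It is not the authors' code or pipeline: it assumes biallelic SNPs in Hardy-Weinberg equilibrium, the function names are hypothetical, and the allele frequencies are made-up illustrative values rather than data from the article.

```python
import math


def expected_heterozygosity(p):
    """Expected heterozygosity 2pq for a biallelic SNP (Hardy-Weinberg assumed)."""
    return 2.0 * p * (1.0 - p)


def locus_match_probability(p):
    """Probability that two unrelated people share a genotype at one SNP.

    Under Hardy-Weinberg the genotype frequencies are p^2, 2pq and q^2;
    the match probability is the sum of their squares.
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)


def combined_log10_match_probability(allele_freqs):
    """log10 of the product of per-locus match probabilities.

    Multiplying across loci is only valid if the loci are independent,
    which is the assumption stated in the abstract. Summing logs avoids
    floating-point underflow for large panels.
    """
    return sum(math.log10(locus_match_probability(p)) for p in allele_freqs)


# Hypothetical allele frequencies for three SNPs (not data from the article).
example_freqs = [0.5, 0.4, 0.3]
log10_mp = combined_log10_match_probability(example_freqs)
print(f"combined match probability is about 10^{log10_mp:.2f}")
```

A SNP with allele frequency 0.5 gives the lowest per-locus match probability this model allows (0.375), so panels made of SNPs with balanced allele frequencies are the most discriminating under these assumptions.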