A New Computational Deconvolution Algorithm for the Analysis of Forensic DNA Mixtures with SNP Markers
Single nucleotide polymorphisms (SNPs) support robust analysis of degraded DNA samples. However, systematic methods for interpreting SNP profiles derived from mixtures have received little study, and interpretation remains a challenge because SNP markers are bi-allelic. To improve the discrimi...
Main Authors: | Yu Yin, Peng Zhang, Yu Xing |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-05-01 |
Series: | Genes |
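Subjects: | forensic genetics; bioinformatics; single nucleotide polymorphism (SNP); massively parallel sequencing (MPS); Precision ID Identity Panel; DNA mixture deconvolution |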
Online Access: | https://www.mdpi.com/2073-4425/13/5/884 |
_version_ | 1797499593132343296 |
---|---|
author | Yu Yin, Peng Zhang, Yu Xing |
author_facet | Yu Yin, Peng Zhang, Yu Xing |
author_sort | Yu Yin |
collection | DOAJ |
description | Single nucleotide polymorphisms (SNPs) support robust analysis of degraded DNA samples. However, systematic methods for interpreting SNP profiles derived from mixtures have received little study, and interpretation remains a challenge because SNP markers are bi-allelic. To improve the discriminating power of SNPs, this study explored bioinformatic strategies for analyzing mixtures. Computer-generated mixtures were produced from real-world massively parallel sequencing (MPS) data of single-source samples processed with the Precision ID Identity Panel. The frequency of major allele reads (F_MAR) was calculated and used as the key parameter to deconvolve two-person mixtures and estimate mixture ratios. Four custom R scripts (three for autosomes and one for the Y chromosome) were designed with K-means clustering as the core algorithm. Finally, the method was validated with real-world mixtures. The deconvolution accuracy for evenly balanced mixtures was 100% or close to 100%, matching the accuracy of inferring the genotypes of the major contributor in unevenly balanced mixtures. The accuracy of inferring the genotypes of the minor contributor decreased as its proportion in the mixture decreased. The estimated mixture ratio was almost equal to the actual ratio for ratios between 1:1 and 1:6. The proposed method provides a new paradigm for mixture interpretation, especially for inferring contributor profiles of evenly balanced mixtures and the major contributor profile of unevenly balanced mixtures. |
first_indexed | 2024-03-10T03:49:37Z |
format | Article |
id | doaj.art-c2e26296cf964e85a8a6f9523550b453 |
institution | Directory of Open Access Journals |
issn | 2073-4425 |
language | English |
last_indexed | 2024-03-10T03:49:37Z |
publishDate | 2022-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Genes |
spelling | doaj.art-c2e26296cf964e85a8a6f9523550b453; 2023-11-23T11:11:03Z; eng; MDPI AG; Genes; 2073-4425; 2022-05-01; 13; 5; 884; 10.3390/genes13050884; A New Computational Deconvolution Algorithm for the Analysis of Forensic DNA Mixtures with SNP Markers; Yu Yin, Peng Zhang, Yu Xing (Department of Forensic Medicine, Chongqing Medical University, #1 Yixueyuan Road, Chongqing 400016, China); https://www.mdpi.com/2073-4425/13/5/884; forensic genetics; bioinformatics; single nucleotide polymorphism (SNP); massively parallel sequencing (MPS); Precision ID Identity Panel; DNA mixture deconvolution |
title | A New Computational Deconvolution Algorithm for the Analysis of Forensic DNA Mixtures with SNP Markers |
topic | forensic genetics; bioinformatics; single nucleotide polymorphism (SNP); massively parallel sequencing (MPS); Precision ID Identity Panel; DNA mixture deconvolution |
url | https://www.mdpi.com/2073-4425/13/5/884 |
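
The description names two technical ingredients: the frequency of major allele reads at each SNP, and K-means clustering of those values to separate a two-person mixture and estimate its ratio. The Python sketch below illustrates that idea only; it is not the authors' R scripts. All function names are hypothetical, the read counts are made-up illustrative numbers, and the five-cluster layout is an assumption derived from simple proportional mixing: with a minor contributor fraction m below 1/3, the expected values fall near 0.5, (1+m)/2, 1-m, 1-m/2 and 1.

```python
from typing import List, Tuple


def fmar(reads_a: int, reads_b: int) -> float:
    """Frequency of major allele reads at one bi-allelic SNP."""
    total = reads_a + reads_b
    if total == 0:
        raise ValueError("locus has no reads")
    return max(reads_a, reads_b) / total


def kmeans_1d(values: List[float], k: int,
              iters: int = 100) -> Tuple[List[float], List[int]]:
    """Plain Lloyd's k-means on scalars (k >= 2).

    Centers start evenly spaced over the data range, so the run is
    deterministic. Returns (centers, labels).
    """
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest center for every value.
        labels = [min(range(k), key=lambda j: abs(v - centers[j]))
                  for v in values]
        # Update step: mean of each cluster; empty clusters keep their center.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members)
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def minor_fraction(centers: List[float]) -> float:
    """Estimate the minor contributor fraction m from five cluster centers.

    Assumption (not from the source): sorted centers sit near
    0.5, (1+m)/2, 1-m, 1-m/2 and 1, which holds for m < 1/3.
    """
    c = sorted(centers)
    estimates = [2 * c[1] - 1, 1 - c[2], 2 * (1 - c[3])]
    return sum(estimates) / len(estimates)


# Illustrative (reads_A, reads_B) pairs for a hypothetical 1:3 mixture.
read_counts = [
    (500, 500), (505, 495), (495, 505),
    (625, 375), (370, 630), (620, 380),
    (750, 250), (245, 755), (745, 255),
    (875, 125), (120, 880), (870, 130),
    (1000, 0), (0, 1000), (998, 2),
]

fmar_values = [fmar(a, b) for a, b in read_counts]
centers, labels = kmeans_1d(fmar_values, k=5)
m_hat = minor_fraction(centers)

print("cluster centers:", [round(c, 3) for c in centers])
print("estimated minor fraction:", round(m_hat, 3))
```

Each cluster label corresponds to one class of genotype combination under the stated assumption, which is what makes per-locus genotype inference possible. For balanced mixtures several expected values coincide, so fewer clusters would be appropriate; choosing k is left open here.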