Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNA
Abstract Low-coverage imputation is becoming ever more present in ancient DNA (aDNA) studies. Imputation pipelines commonly used for present-day genomes have been shown to yield accurate results when applied to ancient genomes. However, post-mortem damage (PMD), in the form of C-to-T substitutions a...
Main Authors: | Antonio Garrido Marques, Simone Rubinacci, Anna-Sapfo Malaspinas, Olivier Delaneau, Bárbara Sousa da Mota |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2024-03-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-024-56584-3 |
_version_ | 1797259326598938624 |
---|---|
author | Antonio Garrido Marques Simone Rubinacci Anna-Sapfo Malaspinas Olivier Delaneau Bárbara Sousa da Mota |
author_facet | Antonio Garrido Marques Simone Rubinacci Anna-Sapfo Malaspinas Olivier Delaneau Bárbara Sousa da Mota |
author_sort | Antonio Garrido Marques |
collection | DOAJ |
description | Abstract Low-coverage imputation is becoming ever more present in ancient DNA (aDNA) studies. Imputation pipelines commonly used for present-day genomes have been shown to yield accurate results when applied to ancient genomes. However, post-mortem damage (PMD), in the form of C-to-T substitutions at the reads termini, and contamination with DNA from closely related species can potentially affect imputation performance in aDNA. In this study, we evaluated imputation performance (i) when using a genotype caller designed for aDNA, ATLAS, compared to bcftools, and (ii) when contamination is present. We evaluated imputation performance with principal component analyses and by calculating imputation error rates. With a particular focus on differently imputed sites, we found that using ATLAS prior to imputation substantially improved imputed genotypes for a very damaged ancient genome (42% PMD). Trimming the ends of the sequencing reads led to similar improvements in imputation accuracy. For the remaining genomes, ATLAS brought limited gains. Finally, to examine the effect of contamination on imputation, we added various amounts of reads from two present-day genomes to a previously downsampled high-coverage ancient genome. We observed that imputation accuracy drastically decreased for contamination rates above 5%. In conclusion, we recommend (i) accounting for PMD by either trimming sequencing reads or using a genotype caller such as ATLAS before imputing highly damaged genomes and (ii) only imputing genomes containing up to 5% of contamination. |
first_indexed | 2024-04-24T23:07:39Z |
format | Article |
id | doaj.art-09ef1be7957a40298ef0e78f4d92473a |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-04-24T23:07:39Z |
publishDate | 2024-03-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-09ef1be7957a40298ef0e78f4d92473a2024-03-17T12:23:19ZengNature PortfolioScientific Reports2045-23222024-03-0114111310.1038/s41598-024-56584-3Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNAAntonio Garrido Marques0Simone Rubinacci1Anna-Sapfo Malaspinas2Olivier Delaneau3Bárbara Sousa da Mota4Department of Computational Biology, University of LausanneDivision of Genetics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical SchoolDepartment of Computational Biology, University of LausanneRegeneron Genetics CenterDepartment of Computational Biology, University of LausanneAbstract Low-coverage imputation is becoming ever more present in ancient DNA (aDNA) studies. Imputation pipelines commonly used for present-day genomes have been shown to yield accurate results when applied to ancient genomes. However, post-mortem damage (PMD), in the form of C-to-T substitutions at the reads termini, and contamination with DNA from closely related species can potentially affect imputation performance in aDNA. In this study, we evaluated imputation performance (i) when using a genotype caller designed for aDNA, ATLAS, compared to bcftools, and (ii) when contamination is present. We evaluated imputation performance with principal component analyses and by calculating imputation error rates. With a particular focus on differently imputed sites, we found that using ATLAS prior to imputation substantially improved imputed genotypes for a very damaged ancient genome (42% PMD). Trimming the ends of the sequencing reads led to similar improvements in imputation accuracy. For the remaining genomes, ATLAS brought limited gains. Finally, to examine the effect of contamination on imputation, we added various amounts of reads from two present-day genomes to a previously downsampled high-coverage ancient genome. We observed that imputation accuracy drastically decreased for contamination rates above 5%. In conclusion, we recommend (i) accounting for PMD by either trimming sequencing reads or using a genotype caller such as ATLAS before imputing highly damaged genomes and (ii) only imputing genomes containing up to 5% of contamination.https://doi.org/10.1038/s41598-024-56584-3 |
spellingShingle | Antonio Garrido Marques Simone Rubinacci Anna-Sapfo Malaspinas Olivier Delaneau Bárbara Sousa da Mota Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNA Scientific Reports |
title | Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNA |
title_full | Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNA |
title_fullStr | Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNA |
title_full_unstemmed | Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNA |
title_short | Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNA |
title_sort | assessing the impact of post mortem damage and contamination on imputation performance in ancient dna |
url | https://doi.org/10.1038/s41598-024-56584-3 |
work_keys_str_mv | AT antoniogarridomarques assessingtheimpactofpostmortemdamageandcontaminationonimputationperformanceinancientdna AT simonerubinacci assessingtheimpactofpostmortemdamageandcontaminationonimputationperformanceinancientdna AT annasapfomalaspinas assessingtheimpactofpostmortemdamageandcontaminationonimputationperformanceinancientdna AT olivierdelaneau assessingtheimpactofpostmortemdamageandcontaminationonimputationperformanceinancientdna AT barbarasousadamota assessingtheimpactofpostmortemdamageandcontaminationonimputationperformanceinancientdna |