Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNA

Bibliographic Details
Main Authors: Antonio Garrido Marques, Simone Rubinacci, Anna-Sapfo Malaspinas, Olivier Delaneau, Bárbara Sousa da Mota
Format: Article
Language: English
Published: Nature Portfolio 2024-03-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-56584-3
collection DOAJ
description Abstract Low-coverage imputation is becoming ever more present in ancient DNA (aDNA) studies. Imputation pipelines commonly used for present-day genomes have been shown to yield accurate results when applied to ancient genomes. However, post-mortem damage (PMD), in the form of C-to-T substitutions at the read termini, and contamination with DNA from closely related species can potentially affect imputation performance in aDNA. In this study, we evaluated imputation performance (i) when using a genotype caller designed for aDNA, ATLAS, compared to bcftools, and (ii) when contamination is present. We evaluated imputation performance with principal component analyses and by calculating imputation error rates. With a particular focus on differently imputed sites, we found that using ATLAS prior to imputation substantially improved imputed genotypes for a very damaged ancient genome (42% PMD). Trimming the ends of the sequencing reads led to similar improvements in imputation accuracy. For the remaining genomes, ATLAS brought limited gains. Finally, to examine the effect of contamination on imputation, we added various amounts of reads from two present-day genomes to a previously downsampled high-coverage ancient genome. We observed that imputation accuracy drastically decreased for contamination rates above 5%. In conclusion, we recommend (i) accounting for PMD by either trimming sequencing reads or using a genotype caller such as ATLAS before imputing highly damaged genomes and (ii) only imputing genomes containing up to 5% contamination.
first_indexed 2024-04-24T23:07:39Z
id doaj.art-09ef1be7957a40298ef0e78f4d92473a
institution Directory Open Access Journal
issn 2045-2322
last_indexed 2024-04-24T23:07:39Z
citation Scientific Reports, vol. 14, iss. 1, pp. 1-13 (2024-03-01)
affiliations Antonio Garrido Marques: Department of Computational Biology, University of Lausanne
Simone Rubinacci: Division of Genetics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School
Anna-Sapfo Malaspinas: Department of Computational Biology, University of Lausanne
Olivier Delaneau: Regeneron Genetics Center
Bárbara Sousa da Mota: Department of Computational Biology, University of Lausanne