Predictors of sequence capture in a large-scale anchored phylogenomics project

Next-generation sequencing (NGS) technologies have revolutionized phylogenomics by decreasing the cost and time required to generate sequence data from multiple markers or whole genomes. Further, the fragmented DNA of biological specimens collected decades ago can be sequenced with NGS, reducing the...

Bibliographic Details
Main Authors: Renato Nunes, Caroline Storer, Tenzing Doleck, Akito Y. Kawahara, Naomi E. Pierce, David J. Lohman
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-11-01
Series: Frontiers in Ecology and Evolution
Subjects: anchored hybrid enrichment; historical DNA; hybrid capture; Lepidoptera; museomics; archival DNA
Online Access: https://www.frontiersin.org/articles/10.3389/fevo.2022.943361/full
author Renato Nunes
Caroline Storer
Tenzing Doleck
Akito Y. Kawahara
Naomi E. Pierce
David J. Lohman
collection DOAJ
description Next-generation sequencing (NGS) technologies have revolutionized phylogenomics by decreasing the cost and time required to generate sequence data from multiple markers or whole genomes. Further, the fragmented DNA of biological specimens collected decades ago can be sequenced with NGS, reducing the need for collecting fresh specimens. Sequence capture, also known as anchored hybrid enrichment, is a method to produce reduced representation libraries for NGS. The technique uses single-stranded oligonucleotide probes that hybridize with pre-selected regions of the genome that are sequenced via NGS, culminating in a dataset of numerous orthologous loci from multiple taxa. Phylogenetic analyses using these sequences have the potential to resolve deep and shallow phylogenetic relationships. Identifying the factors that affect sequence capture success could save time, money, and valuable specimens that might be destructively sampled despite low likelihood of sequencing success. We investigated the impacts of specimen age, preservation method, and DNA concentration on sequence capture (number of captured sequences and sequence quality) while accounting for taxonomy and extracted tissue type in a large-scale butterfly phylogenomics project. This project used two probe sets to extract 391 loci or a subset of 13 loci from over 6,000 butterfly specimens. We found that sequence capture is a resilient method capable of amplifying loci in samples of varying age (0–111 years), preservation method (alcohol, papered, pinned), and DNA concentration (0.020–316 ng/μl). Regression analyses demonstrate that sequence capture is positively correlated with DNA concentration. However, sequence capture and DNA concentration are negatively correlated with sample age and preservation method. Our findings suggest that sequence capture projects should prioritize the use of alcohol-preserved samples younger than 20 years old when available. In the absence of such specimens, dried samples of any age can yield sequence data, albeit with returns that diminish with increasing age.
first_indexed 2024-04-11T17:35:24Z
format Article
id doaj.art-38db2ffb841545dbbf3d681216a64c9c
institution Directory of Open Access Journals
issn 2296-701X
language English
last_indexed 2024-04-11T17:35:24Z
publishDate 2022-11-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Ecology and Evolution
spelling doaj.art-38db2ffb841545dbbf3d681216a64c9c; 2022-12-22T04:11:37Z; eng; Frontiers Media S.A.; Frontiers in Ecology and Evolution; 2296-701X; 2022-11-01; 10; 10.3389/fevo.2022.943361; 943361
Predictors of sequence capture in a large-scale anchored phylogenomics project
Renato Nunes (0, 1); Caroline Storer (2); Tenzing Doleck (3, 4); Akito Y. Kawahara (5, 6, 7); Naomi E. Pierce (8); David J. Lohman (9, 10, 11)
0 Biology Department, City College of New York, City University of New York, New York, NY, United States
1 PhD Program in Biology, Graduate Center, City University of New York, New York, NY, United States
2 McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, Gainesville, FL, United States
3 Biology Department, City College of New York, City University of New York, New York, NY, United States
4 PhD Program in Biology, Graduate Center, City University of New York, New York, NY, United States
5 McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, Gainesville, FL, United States
6 Entomology and Nematology Department, University of Florida, Gainesville, FL, United States
7 Department of Biology, University of Florida, Gainesville, FL, United States
8 Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, MA, United States
9 Biology Department, City College of New York, City University of New York, New York, NY, United States
10 PhD Program in Biology, Graduate Center, City University of New York, New York, NY, United States
11 Entomology Section, National Museum of Natural History, Manila, Philippines
https://www.frontiersin.org/articles/10.3389/fevo.2022.943361/full
anchored hybrid enrichment; historical DNA; hybrid capture; Lepidoptera; museomics; archival DNA
title Predictors of sequence capture in a large-scale anchored phylogenomics project
topic anchored hybrid enrichment
historical DNA
hybrid capture
Lepidoptera
museomics
archival DNA
url https://www.frontiersin.org/articles/10.3389/fevo.2022.943361/full