Dog10K_Boxer_Tasha_1.0: A Long-Read Assembly of the Dog Reference Genome


Bibliographic Details
Main Authors: Vidhya Jagannathan, Christophe Hitte, Jeffrey M. Kidd, Patrick Masterson, Terence D. Murphy, Sarah Emery, Brian Davis, Reuben M. Buckley, Yan-Hu Liu, Xiang-Quan Zhang, Tosso Leeb, Ya-Ping Zhang, Elaine A. Ostrander, Guo-Dong Wang
Format: Article
Language: English
Published: MDPI AG 2021-05-01
Series: Genes
Online Access: https://www.mdpi.com/2073-4425/12/6/847
collection DOAJ
description The domestic dog has evolved to be an important biomedical model for studies regarding the genetic basis of disease, morphology and behavior. Genetic studies in the dog have relied on a draft reference genome of a purebred female boxer dog named “Tasha” initially published in 2005. Derived from a Sanger whole genome shotgun sequencing approach coupled with limited clone-based sequencing, the initial assembly and subsequent updates have served as the predominant resource for canine genetics for 15 years. While the initial assembly produced a good-quality draft, as with all assemblies produced at the time, it contained gaps, assembly errors and missing sequences, particularly in GC-rich regions, which are found at many promoters and in the first exons of protein-coding genes. Here, we present Dog10K_Boxer_Tasha_1.0, an improved chromosome-level highly contiguous genome assembly of Tasha created with long-read technologies that increases sequence contiguity >100-fold, closes >23,000 gaps of the CanFam3.1 reference assembly and improves gene annotation by identifying >1200 new protein-coding transcripts. The assembly and annotation are available at NCBI under the accession GCF_000002285.5.
first_indexed 2024-03-10T10:52:38Z
id doaj.art-63a7b305aff34e4d8fea627a299facad
institution Directory of Open Access Journals
issn 2073-4425
last_indexed 2024-03-10T10:52:38Z
spelling doaj.art-63a7b305aff34e4d8fea627a299facad; 2023-11-21T22:08:05Z; eng; MDPI AG; Genes; ISSN 2073-4425; 2021-05-01; vol. 12, no. 6, art. 847; doi:10.3390/genes12060847
affiliation Vidhya Jagannathan: Vetsuisse Faculty, Institute of Genetics, University of Bern, 3001 Bern, Switzerland
Christophe Hitte: Institute Genetics Development Rennes, University of Rennes, CNRS-UMR 6290, F-35000 Rennes, France
Jeffrey M. Kidd: Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
Patrick Masterson: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20984, USA
Terence D. Murphy: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20984, USA
Sarah Emery: Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
Brian Davis: Department of Integrative Biological Sciences, Texas A&M University, College Station, TX 77840, USA
Reuben M. Buckley: National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20984, USA
Yan-Hu Liu: State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
Xiang-Quan Zhang: State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
Tosso Leeb: Vetsuisse Faculty, Institute of Genetics, University of Bern, 3001 Bern, Switzerland
Ya-Ping Zhang: State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
Elaine A. Ostrander: National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20984, USA
Guo-Dong Wang: State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
title Dog10K_Boxer_Tasha_1.0: A Long-Read Assembly of the Dog Reference Genome
topic Canis lupus familiaris
high quality
contiguity
Pacific Biosciences
annotation
resource
url https://www.mdpi.com/2073-4425/12/6/847