Reference-Guided Draft Genome Assembly, Annotation and SSR Mining Data of the Peruvian Creole Cattle (<i>Bos taurus</i>)

The Peruvian creole cattle (PCC) is a neglected breed and an essential livestock resource in the Andean region of Peru. To develop a modern breeding program and conservation strategies for the PCC, a better understanding of the genetics of this breed is needed. We sequenced the whole genome of the P...

Bibliographic Details
Main Authors: Richard Estrada, Flor-Anita Corredor, Deyanira Figueroa, Wilian Salazar, Carlos Quilcate, Héctor V. Vásquez, Jorge L. Maicelo, Jhony Gonzales, Carlos I. Arbizu
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Data
Subjects: NGS, neglected breed, genome, reference scaffolding, microsatellites
Online Access: https://www.mdpi.com/2306-5729/7/11/155
author Richard Estrada
Flor-Anita Corredor
Deyanira Figueroa
Wilian Salazar
Carlos Quilcate
Héctor V. Vásquez
Jorge L. Maicelo
Jhony Gonzales
Carlos I. Arbizu
author_sort Richard Estrada
collection DOAJ
description The Peruvian creole cattle (PCC) is a neglected breed and an essential livestock resource in the Andean region of Peru. To develop a modern breeding program and conservation strategies for the PCC, a better understanding of the genetics of this breed is needed. We sequenced the whole genome of the PCC using a de novo assembly approach with a paired-end 150 strategy on the Illumina HiSeq 2500 platform, obtaining 320 GB of sequencing data. A reference scaffolding was used to improve the draft genome. The obtained genome size of the PCC was 2.81 Gb with a contig N50 of 108 Mb and 92.59% complete BUSCOs. This genome size is similar to the genome references of <i>Bos taurus</i> and <i>B. indicus</i>. In addition, we identified 40.22% of repetitive DNA of the genome assembly, of which retroelements occupy 32.39% of the total genome. A total of 19,803 protein-coding genes were annotated in the PCC genome. For SSR data mining, we detected similar statistics in comparison with other breeds. The PCC genome will contribute to a better understanding of the genetics of this species and its adaptation to tough conditions in the Andean ecosystem.
first_indexed 2024-03-09T19:09:07Z
format Article
id doaj.art-28d69e58765b4ea1b3c1c1cfb1f991f1
institution Directory Open Access Journal
issn 2306-5729
language English
last_indexed 2024-03-09T19:09:07Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series Data
spelling doaj.art-28d69e58765b4ea1b3c1c1cfb1f991f1
2023-11-24T04:16:58Z
eng
MDPI AG
Data
2306-5729
2022-11-01
7
11
155
10.3390/data7110155
Reference-Guided Draft Genome Assembly, Annotation and SSR Mining Data of the Peruvian Creole Cattle (<i>Bos taurus</i>)
Richard Estrada [0]
Flor-Anita Corredor [1]
Deyanira Figueroa [2]
Wilian Salazar [3]
Carlos Quilcate [4]
Héctor V. Vásquez [5]
Jorge L. Maicelo [6]
Jhony Gonzales [7]
Carlos I. Arbizu [8]
Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru
Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru
Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru
Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru
Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru
Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru
Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru
Laboratorio de Biología Molecular, Universidad Nacional de Frontera, Av. San Hilarión 101, Sullana 20103, Peru
Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru
The Peruvian creole cattle (PCC) is a neglected breed and an essential livestock resource in the Andean region of Peru. To develop a modern breeding program and conservation strategies for the PCC, a better understanding of the genetics of this breed is needed. We sequenced the whole genome of the PCC using a de novo assembly approach with a paired-end 150 strategy on the Illumina HiSeq 2500 platform, obtaining 320 GB of sequencing data. A reference scaffolding was used to improve the draft genome. The obtained genome size of the PCC was 2.81 Gb with a contig N50 of 108 Mb and 92.59% complete BUSCOs. This genome size is similar to the genome references of <i>Bos taurus</i> and <i>B. indicus</i>. In addition, we identified 40.22% of repetitive DNA of the genome assembly, of which retroelements occupy 32.39% of the total genome. A total of 19,803 protein-coding genes were annotated in the PCC genome. For SSR data mining, we detected similar statistics in comparison with other breeds. The PCC genome will contribute to a better understanding of the genetics of this species and its adaptation to tough conditions in the Andean ecosystem.
https://www.mdpi.com/2306-5729/7/11/155
NGS
neglected breed
genome
reference scaffolding
microsatellites
title Reference-Guided Draft Genome Assembly, Annotation and SSR Mining Data of the Peruvian Creole Cattle (<i>Bos taurus</i>)
topic NGS
neglected breed
genome
reference scaffolding
microsatellites
url https://www.mdpi.com/2306-5729/7/11/155