Reference-Guided Draft Genome Assembly, Annotation and SSR Mining Data of the Peruvian Creole Cattle (<i>Bos taurus</i>)
The Peruvian creole cattle (PCC) is a neglected breed and an essential livestock resource in the Andean region of Peru. To develop a modern breeding program and conservation strategies for the PCC, a better understanding of the genetics of this breed is needed. We sequenced the whole genome of the P...
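Main Authors: | Richard Estrada, Flor-Anita Corredor, Deyanira Figueroa, Wilian Salazar, Carlos Quilcate, Héctor V. Vásquez, Jorge L. Maicelo, Jhony Gonzales, Carlos I. Arbizu |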
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-11-01 |
Series: | Data |
Subjects: | NGS, neglected breed, genome, reference scaffolding, microsatellites |
Online Access: | https://www.mdpi.com/2306-5729/7/11/155 |
_version_ | 1797468626133975040 |
---|---|
author | Richard Estrada Flor-Anita Corredor Deyanira Figueroa Wilian Salazar Carlos Quilcate Héctor V. Vásquez Jorge L. Maicelo Jhony Gonzales Carlos I. Arbizu |
author_facet | Richard Estrada Flor-Anita Corredor Deyanira Figueroa Wilian Salazar Carlos Quilcate Héctor V. Vásquez Jorge L. Maicelo Jhony Gonzales Carlos I. Arbizu |
author_sort | Richard Estrada |
collection | DOAJ |
description | The Peruvian creole cattle (PCC) is a neglected breed and an essential livestock resource in the Andean region of Peru. To develop a modern breeding program and conservation strategies for the PCC, a better understanding of the genetics of this breed is needed. We sequenced the whole genome of the PCC using a de novo assembly approach with a paired-end 150 strategy on the Illumina HiSeq 2500 platform, obtaining 320 GB of sequencing data. A reference scaffolding was used to improve the draft genome. The obtained genome size of the PCC was 2.81 Gb with a contig N50 of 108 Mb and 92.59% complete BUSCOs. This genome size is similar to the genome references of <i>Bos taurus</i> and <i>B. indicus</i>. In addition, we identified 40.22% of repetitive DNA of the genome assembly, of which retroelements occupy 32.39% of the total genome. A total of 19,803 protein-coding genes were annotated in the PCC genome. For SSR data mining, we detected similar statistics in comparison with other breeds. The PCC genome will contribute to a better understanding of the genetics of this species and its adaptation to tough conditions in the Andean ecosystem. |
first_indexed | 2024-03-09T19:09:07Z |
format | Article |
id | doaj.art-28d69e58765b4ea1b3c1c1cfb1f991f1 |
institution | Directory Open Access Journal |
issn | 2306-5729 |
language | English |
last_indexed | 2024-03-09T19:09:07Z |
publishDate | 2022-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Data |
spelling | doaj.art-28d69e58765b4ea1b3c1c1cfb1f991f1; 2023-11-24T04:16:58Z; eng; MDPI AG; Data; 2306-5729; 2022-11-01; 7; 11; 155; 10.3390/data7110155; Reference-Guided Draft Genome Assembly, Annotation and SSR Mining Data of the Peruvian Creole Cattle (<i>Bos taurus</i>); Richard Estrada [0]; Flor-Anita Corredor [1]; Deyanira Figueroa [2]; Wilian Salazar [3]; Carlos Quilcate [4]; Héctor V. Vásquez [5]; Jorge L. Maicelo [6]; Jhony Gonzales [7]; Carlos I. Arbizu [8]; [0] Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru; [1] Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru; [2] Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru; [3] Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru; [4] Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru; [5] Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru; [6] Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru; [7] Laboratorio de Biología Molecular, Universidad Nacional de Frontera, Av. San Hilarión 101, Sullana 20103, Peru; [8] Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, Lima 15024, Peru; The Peruvian creole cattle (PCC) is a neglected breed and an essential livestock resource in the Andean region of Peru. To develop a modern breeding program and conservation strategies for the PCC, a better understanding of the genetics of this breed is needed. We sequenced the whole genome of the PCC using a de novo assembly approach with a paired-end 150 strategy on the Illumina HiSeq 2500 platform, obtaining 320 GB of sequencing data. A reference scaffolding was used to improve the draft genome. The obtained genome size of the PCC was 2.81 Gb with a contig N50 of 108 Mb and 92.59% complete BUSCOs. This genome size is similar to the genome references of <i>Bos taurus</i> and <i>B. indicus</i>. In addition, we identified 40.22% of repetitive DNA of the genome assembly, of which retroelements occupy 32.39% of the total genome. A total of 19,803 protein-coding genes were annotated in the PCC genome. For SSR data mining, we detected similar statistics in comparison with other breeds. The PCC genome will contribute to a better understanding of the genetics of this species and its adaptation to tough conditions in the Andean ecosystem.; https://www.mdpi.com/2306-5729/7/11/155; NGS; neglected breed; genome; reference scaffolding; microsatellites |
spellingShingle | Richard Estrada Flor-Anita Corredor Deyanira Figueroa Wilian Salazar Carlos Quilcate Héctor V. Vásquez Jorge L. Maicelo Jhony Gonzales Carlos I. Arbizu Reference-Guided Draft Genome Assembly, Annotation and SSR Mining Data of the Peruvian Creole Cattle (<i>Bos taurus</i>) Data NGS neglected breed genome reference scaffolding microsatellites |
title | Reference-Guided Draft Genome Assembly, Annotation and SSR Mining Data of the Peruvian Creole Cattle (<i>Bos taurus</i>) |
title_sort | reference guided draft genome assembly annotation and ssr mining data of the peruvian creole cattle i bos taurus i |
topic | NGS neglected breed genome reference scaffolding microsatellites |
url | https://www.mdpi.com/2306-5729/7/11/155 |
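The description above reports a contig N50 and the results of SSR (microsatellite) mining. As a rough illustration of what those two quantities mean, the following minimal Python sketch computes an N50 from a list of sequence lengths and scans a sequence for perfect tandem repeats. This record does not name the tools or parameters the authors used, so the function names `n50` and `find_ssrs` and the repeat thresholds in `MIN_REPEATS` are hypothetical choices made only for this example.

```python
import re


def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Hypothetical thresholds (motif length -> minimum number of repeats).
# These are illustrative defaults, not the settings used in the article.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each repeat is reported only under its shortest motif.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits
```

For example, `n50([2, 3, 4, 5, 6])` walks the lengths from largest to smallest and stops at the value where the running sum first reaches half of the total. Dedicated SSR tools also handle compound and imperfect repeats, which this sketch ignores.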