A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids

Abstract. Background: Genotyping-by-sequencing (GBS) has been used broadly in genetic studies for several species, especially those with agricultural importance. However, its use is still limited in autopolyploid species because genotype calling software generally fails to properly distinguish heteroz...

Bibliographic Details
Main Authors: Guilherme S. Pereira, Antonio Augusto F. Garcia, Gabriel R. A. Margarido
Format: Article
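Language: English
Published: BMC 2018-11-01
Series: BMC Bioinformatics
Subjects: Genotyping-by-sequencing; Ploidy estimation; Allele dosage; Population structure; Linkage mapping; GWAS
Online Access: http://link.springer.com/article/10.1186/s12859-018-2433-6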
_version_ 1818544464995549184
author Guilherme S. Pereira
Antonio Augusto F. Garcia
Gabriel R. A. Margarido
author_facet Guilherme S. Pereira
Antonio Augusto F. Garcia
Gabriel R. A. Margarido
author_sort Guilherme S. Pereira
collection DOAJ
description Abstract. Background: Genotyping-by-sequencing (GBS) has been used broadly in genetic studies for several species, especially those with agricultural importance. However, its use is still limited in autopolyploid species because genotype calling software generally fails to properly distinguish heterozygous classes based on allele dosage. Results: VCF2SM is a Python script that integrates sequencing depth information of polymorphisms in variant call format (VCF) files and SuperMASSA software for quantitative genotype calling. VCFs can be obtained from any variant discovery software that outputs exact allele sequencing depth, such as a modified version of the Tassel-GBS pipeline provided here. VCF2SM was successfully applied in analyzing GBS data from diverse panels (alfalfa and potato) and full-sib mapping populations (alfalfa and switchgrass) of polyploid species. Conclusions: We demonstrate that our approach can help plant geneticists working with autopolyploid species to advance their studies by distinguishing allele dosage from GBS data.
first_indexed 2024-12-11T22:48:50Z
format Article
id doaj.art-e674f87963e046d2a5f56c95d5007bee
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-11T22:48:50Z
publishDate 2018-11-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-e674f87963e046d2a5f56c95d5007bee; 2022-12-22T00:47:32Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2018-11-01; 19; 1; 1; 10; 10.1186/s12859-018-2433-6; A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids; Guilherme S. Pereira 0; Antonio Augusto F. Garcia 1; Gabriel R. A. Margarido 2; University of São Paulo, “Luiz de Queiroz” College of Agriculture, Department of Genetics; University of São Paulo, “Luiz de Queiroz” College of Agriculture, Department of Genetics; University of São Paulo, “Luiz de Queiroz” College of Agriculture, Department of Genetics; Abstract. Background: Genotyping-by-sequencing (GBS) has been used broadly in genetic studies for several species, especially those with agricultural importance. However, its use is still limited in autopolyploid species because genotype calling software generally fails to properly distinguish heterozygous classes based on allele dosage. Results: VCF2SM is a Python script that integrates sequencing depth information of polymorphisms in variant call format (VCF) files and SuperMASSA software for quantitative genotype calling. VCFs can be obtained from any variant discovery software that outputs exact allele sequencing depth, such as a modified version of the Tassel-GBS pipeline provided here. VCF2SM was successfully applied in analyzing GBS data from diverse panels (alfalfa and potato) and full-sib mapping populations (alfalfa and switchgrass) of polyploid species. Conclusions: We demonstrate that our approach can help plant geneticists working with autopolyploid species to advance their studies by distinguishing allele dosage from GBS data.; http://link.springer.com/article/10.1186/s12859-018-2433-6; Genotyping-by-sequencing; Ploidy estimation; Allele dosage; Population structure; Linkage mapping; GWAS
spellingShingle Guilherme S. Pereira
Antonio Augusto F. Garcia
Gabriel R. A. Margarido
A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids
BMC Bioinformatics
Genotyping-by-sequencing
Ploidy estimation
Allele dosage
Population structure
Linkage mapping
GWAS
title A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids
title_full A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids
title_fullStr A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids
title_full_unstemmed A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids
title_short A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids
title_sort fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids
topic Genotyping-by-sequencing
Ploidy estimation
Allele dosage
Population structure
Linkage mapping
GWAS
url http://link.springer.com/article/10.1186/s12859-018-2433-6
work_keys_str_mv AT guilhermespereira afullyautomatedpipelineforquantitativegenotypecallingfromnextgenerationsequencingdatainautopolyploids
AT antonioaugustofgarcia afullyautomatedpipelineforquantitativegenotypecallingfromnextgenerationsequencingdatainautopolyploids
AT gabrielramargarido afullyautomatedpipelineforquantitativegenotypecallingfromnextgenerationsequencingdatainautopolyploids
AT guilhermespereira fullyautomatedpipelineforquantitativegenotypecallingfromnextgenerationsequencingdatainautopolyploids
AT antonioaugustofgarcia fullyautomatedpipelineforquantitativegenotypecallingfromnextgenerationsequencingdatainautopolyploids
AT gabrielramargarido fullyautomatedpipelineforquantitativegenotypecallingfromnextgenerationsequencingdatainautopolyploids
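
Note on the description field: the abstract says VCF2SM integrates per-allele sequencing depth from VCF files with SuperMASSA. The record does not show how the script does this, so the following is only a minimal Python sketch of the first step, reading allele depths from one VCF data line. It assumes the depths are stored in the standard `AD` FORMAT key. The function name `allele_depths`, the sample names, and the example record are all hypothetical and are not taken from VCF2SM.

```python
from typing import Dict, List, Optional, Tuple


def allele_depths(
    vcf_line: str, sample_names: List[str]
) -> Dict[str, Optional[Tuple[int, ...]]]:
    """Map each sample to its per-allele read depths for one VCF data line.

    Assumption: depths are stored under the FORMAT key "AD"
    (reference allele first, then each alternate allele).
    Samples with missing depths map to None.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")  # column 9 lists the per-sample keys
    if "AD" not in format_keys:
        raise ValueError("record has no AD (allele depth) FORMAT field")
    ad_index = format_keys.index("AD")

    depths: Dict[str, Optional[Tuple[int, ...]]] = {}
    for name, sample_field in zip(sample_names, fields[9:]):
        values = sample_field.split(":")
        # Trailing sub-fields may be omitted, and "." marks a missing value.
        if ad_index >= len(values) or "." in values[ad_index].split(","):
            depths[name] = None
        else:
            depths[name] = tuple(int(d) for d in values[ad_index].split(","))
    return depths


# Hypothetical record, invented for illustration only.
example = (
    "chr1\t1042\t.\tA\tG\t.\tPASS\t.\tGT:AD:DP"
    "\t0/1:30,10:40\t./.:.:.\t1/1:0,25:25"
)
result = allele_depths(example, ["ind1", "ind2", "ind3"])
```

The reference-to-alternate read ratio in each sample is the kind of quantitative signal a dosage-aware caller works from. A plain diploid-style `GT` call cannot distinguish the heterozygous dosage classes of an autopolyploid, which is the limitation the abstract describes.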