STAC: A tool to leverage genetic marker data for crop research and breeding

Abstract As genotyping by sequencing (GBS) becomes more prevalent and cost‐effective, there is a benefit in being able to apply the data to solve a variety of problems. However, high degrees of missing data and overreliance on single nucleotide polymorphisms (SNPs), while ignoring other forms of gen...


Bibliographic Details
Main Authors: Scott Carle, Alecia Kiszonas, Kimberly Garland‐Campbell, Craig F. Morris
Format: Article
Language: English
Published: Wiley 2023-12-01
Series: Agrosystems, Geosciences & Environment
Online Access: https://doi.org/10.1002/agg2.20436
author Scott Carle
Alecia Kiszonas
Kimberly Garland‐Campbell
Craig F. Morris
collection DOAJ
description Abstract As genotyping by sequencing (GBS) becomes more prevalent and cost‐effective, there is a benefit in being able to apply the data to solve a variety of problems. However, high degrees of missing data and overreliance on single nucleotide polymorphisms (SNPs), while ignoring other forms of genetic variation, frequently plague attempts to make full use of GBS sequence data. Here we have developed two R scripts to serve as a tool in haplotype determination at loci of interest within biparental populations. One of these scripts, Sparse Tag Allele Caller (STAC), provides both automated calling and visual representations of the data around a locus of interest to assist in rapid data compilation decision‐making. The other script, STAC Integrate, allows automated quality control and logic‐based integration of presence/absence data with SNP data, while also rendering global overviews of recombination and coverage across the genome. These scripts are designed to be used together to maximize the utility of the available data. These tools were validated on a biparental population of wheat that was genotyped through GBS. They successfully enabled haplotype determination of a locus that was difficult to directly genotype, and their systemic accuracy was demonstrated in multiple populations and species. These scripts may serve as a tool for researchers attempting to make better use of GBS and other genetic marker data for both research and crop breeding decisions.
first_indexed 2024-03-08T22:58:33Z
format Article
id doaj.art-819261c164b648c590913fecb80b7039
institution Directory Open Access Journal
issn 2639-6696
language English
last_indexed 2024-03-08T22:58:33Z
publishDate 2023-12-01
publisher Wiley
record_format Article
series Agrosystems, Geosciences & Environment
spelling doaj.art-819261c164b648c590913fecb80b7039
2023-12-16T02:28:30Z
eng
Wiley
Agrosystems, Geosciences & Environment
2639-6696
2023-12-01
Volume 6, Issue 4, pages n/a
10.1002/agg2.20436
STAC: A tool to leverage genetic marker data for crop research and breeding
Scott Carle (Department of Crop and Soil Sciences, Washington State University, Pullman, Washington, USA)
Alecia Kiszonas (USDA‐ARS Western Wheat Quality Laboratory, Washington State University, Pullman, Washington, USA)
Kimberly Garland‐Campbell (USDA‐ARS Wheat Health, Genetics and Quality Research Unit, Washington State University, Pullman, Washington, USA)
Craig F. Morris (USDA‐ARS Western Wheat Quality Laboratory, Washington State University, Pullman, Washington, USA)
https://doi.org/10.1002/agg2.20436
title STAC: A tool to leverage genetic marker data for crop research and breeding
url https://doi.org/10.1002/agg2.20436