GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers

Haplotype prediction models open many possibilities to improve the accuracy of genomic selection but require more data processing and computing time than single-SNP prediction models. To facilitate haplotype analysis for genomic prediction and estimation using structural and functional genomic infor...

Bibliographic Details
Main Authors:	Dzianis Prakapenka, Chunkao Wang, Zuoxiang Liang, Cheng Bian, Cheng Tan, Yang Da
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2020-04-01
Series:	Frontiers in Genetics
Subjects:	genomic selection haplotype SNP heritability prediction accuracy
Online Access:	https://www.frontiersin.org/article/10.3389/fgene.2020.00282/full

_version_	1818854990991589376
author	Dzianis Prakapenka Chunkao Wang Zuoxiang Liang Cheng Bian Cheng Bian Cheng Tan Cheng Tan Yang Da
author_facet	Dzianis Prakapenka Chunkao Wang Zuoxiang Liang Cheng Bian Cheng Bian Cheng Tan Cheng Tan Yang Da
author_sort	Dzianis Prakapenka
collection	DOAJ
description	Haplotype prediction models open many possibilities to improve the accuracy of genomic selection but require more data processing and computing time than single-SNP prediction models. To facilitate haplotype analysis for genomic prediction and estimation using structural and functional genomic information, we developed a computing pipeline with capabilities for preparing input data for haplotype analysis, performing genomic prediction and estimation using GVCHAP, and analyzing GVCHAP results. Data preparation includes utility programs for haplotype imputing; defining haplotype blocks by a fixed number of SNPs, a fixed distance in base pairs per block, or user-defined block lengths based on structural or functional genomic information or a mixture of both types of information; and defining haplotype genotypes within each haplotype block. GVCHAP, the main program for genomic prediction and estimation, calculates GREML (genomic restricted maximum likelihood) estimates of variance components and heritabilities, and calculates GBLUP (genomic best linear unbiased prediction) for additive and dominance values of single SNPs as well as additive values of haplotypes, with reliability estimates for training and validation populations. A two-step strategy and a method of multi-node processing are implemented to remove the computing bottleneck caused by the creation of genomic relationship matrices for large samples. The analysis of GVCHAP results includes calculation of observed prediction accuracies from validation studies and preparation of input files for graphical visualization of heritability estimates of haplotype blocks as well as estimates of SNP effects and heritabilities. The entire pipeline provides an efficient and versatile computing tool for identifying the most accurate haplotype model among many candidate haplotype models that use structural and functional genomic information for genomic selection.
first_indexed	2024-12-19T08:01:30Z
format	Article
id	doaj.art-e2af11008bc942ccae5f5f1e58fac7b2
institution	Directory Open Access Journal
issn	1664-8021
language	English
last_indexed	2024-12-19T08:01:30Z
publishDate	2020-04-01
publisher	Frontiers Media S.A.
record_format	Article
series	Frontiers in Genetics
spelling	doaj.art-e2af11008bc942ccae5f5f1e58fac7b2; 2022-12-21T20:29:51Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2020-04-01; 11; 10.3389/fgene.2020.00282; 515392; GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers; Dzianis Prakapenka (0), Chunkao Wang (1), Zuoxiang Liang (2), Cheng Bian (3), Cheng Bian (4), Cheng Tan (5), Cheng Tan (6), Yang Da (7); (0) Department of Animal Science, University of Minnesota, Saint Paul, MN, United States; (1) Department of Animal Science, University of Minnesota, Saint Paul, MN, United States; (2) Department of Animal Science, University of Minnesota, Saint Paul, MN, United States; (3) Department of Animal Science, University of Minnesota, Saint Paul, MN, United States; (4) State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing, China; (5) Department of Animal Science, University of Minnesota, Saint Paul, MN, United States; (6) National Engineering Research Center for Breeding Swine Industry, South China Agricultural University, Guangzhou, China; (7) Department of Animal Science, University of Minnesota, Saint Paul, MN, United States; https://www.frontiersin.org/article/10.3389/fgene.2020.00282/full; genomic selection; haplotype; SNP; heritability; prediction accuracy
spellingShingle	Dzianis Prakapenka Chunkao Wang Zuoxiang Liang Cheng Bian Cheng Bian Cheng Tan Cheng Tan Yang Da GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers Frontiers in Genetics genomic selection haplotype SNP heritability prediction accuracy
title	GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers
title_full	GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers
title_fullStr	GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers
title_full_unstemmed	GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers
title_short	GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers
title_sort	gvchap a computing pipeline for genomic prediction and variance component estimation using haplotypes and snp markers
topic	genomic selection haplotype SNP heritability prediction accuracy
url	https://www.frontiersin.org/article/10.3389/fgene.2020.00282/full
work_keys_str_mv	AT dzianisprakapenka gvchapacomputingpipelineforgenomicpredictionandvariancecomponentestimationusinghaplotypesandsnpmarkers AT chunkaowang gvchapacomputingpipelineforgenomicpredictionandvariancecomponentestimationusinghaplotypesandsnpmarkers AT zuoxiangliang gvchapacomputingpipelineforgenomicpredictionandvariancecomponentestimationusinghaplotypesandsnpmarkers AT chengbian gvchapacomputingpipelineforgenomicpredictionandvariancecomponentestimationusinghaplotypesandsnpmarkers AT chengbian gvchapacomputingpipelineforgenomicpredictionandvariancecomponentestimationusinghaplotypesandsnpmarkers AT chengtan gvchapacomputingpipelineforgenomicpredictionandvariancecomponentestimationusinghaplotypesandsnpmarkers AT chengtan gvchapacomputingpipelineforgenomicpredictionandvariancecomponentestimationusinghaplotypesandsnpmarkers AT yangda gvchapacomputingpipelineforgenomicpredictionandvariancecomponentestimationusinghaplotypesandsnpmarkers

Similar Items