A function accounting for training set size and marker density to model the average accuracy of genomic prediction.

Bibliographic Details
Main Authors: Malena Erbe, Birgit Gredler, Franz Reinhold Seefried, Beat Bapst, Henner Simianer
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3855218?pdf=render
collection DOAJ
description Prediction of genomic breeding values is of major practical relevance in dairy cattle breeding. Deterministic equations have been suggested that predict the accuracy of genomic breeding values in a given design from training set size, reliability of phenotypes, and the number of independent chromosome segments ([Formula: see text]). The aim of our study was to find a general deterministic equation for the average accuracy of genomic breeding values that also accounts for marker density and can be fitted empirically. Two data sets were available: 5,698 Holstein Friesian bulls genotyped with 50K SNPs, and 1,332 Brown Swiss bulls genotyped with 50K SNPs and imputed to ∼600K SNPs. Different k-fold (k = 2 to 10, 15, 20) cross-validation scenarios (50 replicates, random assignment) were performed using a genomic BLUP approach. A maximum likelihood approach was used to estimate the parameters of different prediction equations. The highest likelihood was obtained with a modified form of the deterministic equation of Daetwyler et al. (2010), augmented by a weighting factor (w) based on the assumption that the maximum achievable accuracy is [Formula: see text]. The proportion of genetic variance captured by the complete SNP sets ([Formula: see text]) was 0.76 to 0.82 for Holstein Friesian and 0.72 to 0.75 for Brown Swiss. When the number of SNPs was modified, w was found to be proportional to the log of the marker density up to a limit that is population and trait specific; in the Brown Swiss population studied, this limit was reached with ∼20,000 SNPs.
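The kind of deterministic equation the description refers to can be illustrated with a minimal Python sketch. It assumes the commonly cited form of the Daetwyler et al. (2010) equation, accuracy = sqrt(N * r2 / (N * r2 + Me)), where N is the training set size, r2 the reliability of the phenotypes, and Me the number of independent chromosome segments. Treating the weighting factor w as a simple multiplier on that accuracy is an assumption made here for illustration, since the record does not reproduce the modified equation. The function names and the values Me = 1000, r2 = 0.9 and w = 0.9 are hypothetical and are not taken from the study.

```python
import math


def daetwyler_accuracy(n_train, reliability, m_e):
    """Expected accuracy sqrt(N*r2 / (N*r2 + Me)), the assumed Daetwyler-type form."""
    if n_train < 0 or m_e <= 0 or not 0.0 <= reliability <= 1.0:
        raise ValueError("need n_train >= 0, m_e > 0 and 0 <= reliability <= 1")
    signal = n_train * reliability
    return math.sqrt(signal / (signal + m_e))


def weighted_accuracy(n_train, reliability, m_e, w):
    """Assumption for illustration: w scales the accuracy, so it also caps it."""
    return w * daetwyler_accuracy(n_train, reliability, m_e)


def training_size(n_total, k):
    """Average number of animals left for training in one fold of k-fold CV."""
    return n_total * (k - 1) / k


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the shape of the curve.
    M_E, R2, W = 1000, 0.9, 0.9
    for k in (2, 5, 10, 20):
        n = training_size(5698, k)  # 5,698 bulls, as in the Holstein Friesian set
        print(k, round(n), round(weighted_accuracy(n, R2, M_E, W), 3))
```

In this sketch a larger k leaves more animals in the training set, so the modelled accuracy rises with k and approaches w from below. This is how cross-validation scenarios with different k supply the range of training set sizes needed to fit such an equation empirically.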
id doaj.art-a8840cdc845a4bccb53296bfc484ee28
institution Directory Open Access Journal
issn 1932-6203
citation PLoS ONE 8(12): e81046 (2013-01-01)
doi 10.1371/journal.pone.0081046