Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations

Abstract Background Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has shown limited increase when using such i...

Bibliographic Details
Main Authors: Nasir Moghaddar, Majid Khansefid, Julius H. J. van der Werf, Sunduimijid Bolormaa, Naomi Duijvesteijn, Samuel A. Clark, Andrew A. Swan, Hans D. Daetwyler, Iona M. MacLeod
Format: Article
Language: eng
Published: BMC 2019-12-01
Series: Genetics Selection Evolution
Online Access: https://doi.org/10.1186/s12711-019-0514-2
collection DOAJ
description Abstract
Background: Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has shown limited increase when using such information in dairy cattle studies, in which one or a few breeds with limited diversity predominate. The objective of our study was to evaluate the accuracy of genomic prediction from imputed WGS genotypes in a multi-breed Australian sheep population in which target individuals were relatively weakly related.
Methods: Between 9,626 and 26,657 animals with phenotypes were available for nine economically important sheep production traits, and all had imputed WGS genotypes. About 30% of the data were used to discover predictive single nucleotide polymorphisms (SNPs) in a genome-wide association study (GWAS), and the remaining data were used for training and validation of genomic prediction. Prediction accuracy using selected variants from imputed sequence data was compared to that using a standard array of 50k SNP genotypes, with both genomic best linear unbiased prediction (GBLUP) and Bayesian methods (BayesR/BayesRC). Accuracy of genomic prediction was evaluated in two independent populations that were each weakly related to the training set, one being purebred Merino and the other crossbred Border Leicester x Merino sheep.
Results: A substantial improvement in prediction accuracy was observed when selected sequence variants were fitted alongside 50k genotypes as a separate variance component in GBLUP (2GBLUP) or as a separate category of SNPs in Bayesian analysis (BayesRC). From an average accuracy of 0.27 in both validation sets for the 50k array, the average absolute increase in accuracy across traits with 2GBLUP was 0.083 and 0.073 for purebred and crossbred animals, respectively, whereas with BayesRC it was 0.102 and 0.087. The average gain in accuracy was smaller when selected sequence variants were treated in the same category as 50k SNPs. Very little improvement over 50k prediction was observed when using all WGS variants.
Conclusions: Accuracy of genomic prediction in diverse sheep populations increased substantially when using variants selected from whole-genome sequence data based on an independent multi-breed GWAS, compared to genomic prediction using standard 50k genotypes.
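The accuracy figures reported in the Results can be put side by side with a little arithmetic. The sketch below is only an illustration: the baseline and the absolute gains are copied from the abstract, while the names (BASELINE_50K, ABSOLUTE_GAIN, summarize) and the dictionary layout are assumptions made for this example. The implied accuracies are averages across traits; per-trait values are not given in the abstract.

```python
# Figures copied from the abstract; names and layout are illustrative assumptions.
BASELINE_50K = 0.27  # average accuracy with the 50k array in both validation sets

# Average absolute increase in accuracy over the 50k baseline, across traits.
ABSOLUTE_GAIN = {
    ("2GBLUP", "purebred"): 0.083,
    ("2GBLUP", "crossbred"): 0.073,
    ("BayesRC", "purebred"): 0.102,
    ("BayesRC", "crossbred"): 0.087,
}


def summarize(baseline, gains):
    """Return the implied average accuracy and the relative gain (%) per entry."""
    summary = {}
    for key, gain in gains.items():
        summary[key] = {
            "accuracy": round(baseline + gain, 3),
            "relative_gain_pct": round(100 * gain / baseline, 1),
        }
    return summary


SUMMARY = summarize(BASELINE_50K, ABSOLUTE_GAIN)

if __name__ == "__main__":
    for (method, group), row in SUMMARY.items():
        print(f"{method:8s} {group:10s} "
              f"accuracy={row['accuracy']:.3f} "
              f"relative gain={row['relative_gain_pct']:.1f}%")
```

Read this way, the reported absolute gains correspond to relative improvements of roughly 27% to 38% over the 50k baseline, with BayesRC ahead of 2GBLUP in both validation sets.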
first_indexed 2024-12-17T20:59:30Z
id doaj.art-aa5b8f4ffd054680ba2e8ca6d4534b54
institution Directory of Open Access Journals
issn 1297-9686
language eng
last_indexed 2024-12-17T20:59:30Z
spelling doaj.art-aa5b8f4ffd054680ba2e8ca6d4534b54 | 2022-12-21T21:32:46Z | eng | BMC | Genetics Selection Evolution | 1297-9686 | 2019-12-01 | 51 | 1 | 1 | 14 | 10.1186/s12711-019-0514-2 | author affiliation (all nine authors): Sheep-CRC