Using genetic distance to infer the accuracy of genomic prediction

Bibliographic Details
Main Authors:	Scutari, M; Mackay, I; Balding, D
Format:	Journal article
Published:	Public Library of Science, 2016

Institution:	University of Oxford
Collection:	OXFORD
Identifier:	oxford-uuid:1786ebbb-731d-49ef-9583-f004057b36cc
Resource type:	http://purl.org/coar/resource_type/c_dcae04bc
Source:	Symplectic Elements at Oxford
Record format:	dspace
First indexed:	2024-03-06T19:13:21Z
Last indexed:	2024-03-06T19:13:21Z
Description:	The prediction of phenotypic traits using high-density genomic data has many applications, such as the selection of plants and animals of commercial interest, and it is expected to play an increasing role in medical diagnostics. Statistical models used for this task are usually tested using cross-validation, which implicitly assumes that new individuals (whose phenotypes we would like to predict) originate from the same population that the genomic prediction model is trained on. In this paper we propose an approach based on clustering and resampling to investigate the effect of increasing genetic distance between training and target populations when predicting quantitative traits. This is important for plant and animal genetics, where genomic selection programs rely on the precision of predictions in future rounds of breeding. Estimating how quickly predictive accuracy decays is therefore important in deciding which training population to use and how often the model has to be recalibrated. We find that the correlation between true and predicted values decays approximately linearly with respect to either FST or mean kinship between the training and the target populations. We illustrate this relationship using simulations and a collection of data sets from mice, wheat and human genetics.

Similar Items